Music has an incredibly fascinating effect on people. Humans are one of the few, if not the only, animals that can naturally feel rhythms. When music is played, it only takes moments for people to clap along to the beat of the song. Specific genres and styles of music have defined generations and cultures time and time again. It’s only natural for music to be so powerful for people.
This notion has picked up much attention in the media. In fact, ABC Science released a video a few years ago (watch it here) discussing the power of music on the brain, specifically for people with dementia and Parkinson’s. In the video, researchers showed how playing music reminiscent of one’s past can trigger memories, even for someone who has forgotten them. They did so by curating specially tailored playlists from the pasts of certain residents at a nursing home. When the dementia patients listened to these playlists, many of them had a sudden shift in mood. Their family members were surprised to see them “come back to life,” in a sense. These residents would talk about their past and be more open and happy than they had been in months.
Another segment of the video showed how music can help with diseases that affect the motor system, like Parkinson’s. At a human movement lab, a professor has spent years analyzing movement disorders and looking for ways to treat them. One thing she tried was playing music, and the change that comes over her patients is striking. John, one of her patients, suffers from Parkinson’s. The video shows how the debilitating disease limits his natural motor skills and prevents him from making voluntary movements. However, once the professor plays music for him, John is able not only to walk, but to dance with a partner.
The last part of the video follows a man named Shane who suffered severe brain damage from a bike accident. Following the accident, he could barely speak, move, or recall anything. Later, he became part of an experiment to see if music could evoke memories in people who had severe memory loss from brain injuries. The results showed that he did as well as people with normal, healthy brains, despite the accident. For example, Shane had trouble recalling a memory from grade school when he was simply asked to, but had no trouble doing so while a song from grade school was being played.
This video illustrates not only the power of music, but also its prevalence in our natural being. Music is much more than a form of entertainment; it is part of a complete life. The rhythms and tunes of songs, whether from traditional instruments, natural sounds, or digital playlists, bury themselves in our brains and become an integral part of human life.
Additionally, music is not immune to societal trends, which morph and change over time. Since it is a form of art, it generally reflects the major feelings and emotions circulating at a given time. The prevalence of music in our lives, combined with its artistic essence, makes it a strong vantage point for looking at how society changes the way it expresses itself. For example, there are periods in the late 1940s and 1950s when songs tend to be filled with sad or melancholic lyrics, making for more “low-mood” songs. This has much to do with the major conflicts of the time, namely World War II and the subsequent Cold War.
Other trends in music, and in art in general, can illuminate these societal shifts. Seeing how artists express their views of the world provides a look into life during a given time period. A further advantage of music is that the popularity of a song or genre can be tracked. Music therefore gives a perspective not only on what songs and emotions are being produced at a given time, but also on the extent to which each is consumed. So, it is possible to look at both the emotions expressed in music and how much they resonate with listeners.
Change in music is not limited to different time periods; it is also present across the different phases of one’s life. For example, there is a popular practice for pregnant mothers to play classical music for their babies in hopes of making them more intelligent. In other phases of life, teens may be warned to stay away from certain genres of music because they will “rot their brains” or act as a bad influence. In either case, music enters our lives early. Many young musical prodigies are discovered because they begin reacting to music even before they are able to speak or walk.
Our plan for this study relies on a comprehensive data set supplied by the Spotify API. This data encompasses many interesting variables, and there is much to experiment with. It provides a fertile ground for exploration into how music affects humans. Two variables specifically are eye-catching: valence, a measure of the happiness of a song, and explicitness, whether a song contains explicit lyrics or not. There are also variables regarding year published, popularity, acousticness, etc. There is a perceived trend that current music (2010-2020) is becoming more and more explicit and despairing. In fact, this can be seen in Figure 1.1 with a dip in valence after 1975. It may be possible to look at these different characteristics in acousticness, valence, and explicitness to create a model that predicts the popularity of music.
Figure 1.1: Scatterplot of Spotify’s valence score by year
In addition to a study of trends in music consumption, this study aims to take a closer look at correlations between the music dataset and other datasets on mental health and crime. Is it possible that the prevalence of explicit music is rising with crime rates? Is it possible that the decrease in valence score is correlated with an increase in mental health cases? These questions will be discussed in the subsequent sections.
The datasets used in this project required a lot of work to clean and prepare for visualization and analysis. The data cleaning steps and code are outlined in the ATABAS_HUANG_dataCleaning.Rmd file, with a short description attached to each cleaning procedure.
Music:
The music dataset is made available through the Spotify API. Spotify is an audio streaming service that was launched in 2008 and is now one of the most popular streaming platforms for music and podcasts. Because so much music is consumed through Spotify, the dataset is fairly comprehensive and contains a great deal of information.
The data encompasses many interesting variables. Of course there are descriptors such as artist or band name, year of song, genre, and key, but there are many other metrics that require more attention.
These specially catered metrics usually range from 0 to 1. For example Spotify describes the metric for the valence (happiness) of a song as follows: “A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry)”.
Other metrics, such as explicitness, are described as follows: “Whether or not the episode has explicit content (true = yes it does; false = no it does not OR unknown)”. Danceability, in turn, is described as: “Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable”.
Crime:
The United Nations Office on Drugs and Crime (UNODC) has many comprehensive datasets that pertain to crime in all its different shapes and forms. The organization, as an office of the United Nations, covers pressing concerns related to crime in general, with a focus on tackling and preventing crime. The datasets on its website are grouped by type of crime, with subsets for more detail. For example, distinctions are made between sexual violence, homicide, and assault. Further, assault can be analyzed more specifically through the different mechanisms (firearm, sharp object, serious, etc.).
With a dataset of this comprehensive nature, it is possible to make different comparisons by extracting the parts of the data that are of interest. In this case, it was of interest to study the relationship between trends in crime and trends in explicit music. Many studies have shown certain types of music (classical music, joyous music) to have a positive impact. However, there are also claims that media in the form of video games, TV, or music can have an equally negative impact depending on their contents. Along these lines, it may be interesting to explore any relationships between explicit music and crime.
The limitations of this dataset originate mostly from the collection process. The data are collected through the annual United Nations crime trends survey, which is conducted by the UNODC and takes responses from government officials in a given country. This method relies not only on countries reporting to the U.N., but also on local authorities within countries like the U.S. reporting to the federal level. Reported crime may not be directly comparable across countries, because each country can have different laws regarding different crimes. Even states or counties within a given country can have different guidelines for recording a crime. This should be kept in mind when drawing conclusions.
Since all of these datasets were provided in separate files, it was necessary to read the Excel files separately and then merge them by year.
The data also come in a format that is not easy to work with, having a rate column and a count column for each year (i.e. rate of crime, count of crime). The years available are from 2010 to 2017.
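The reshaping problem described above can be sketched as follows. This is only an illustration in Python (the project's own cleaning code lives in the .Rmd file and is not shown here); the column names `rate_2010`, `count_2010`, the country labels, and all values are made up for the example.

```python
# Hypothetical wide-format rows: one row per country, with a rate column and
# a count column for each year. Names and values are assumptions.
wide_rows = [
    {"country": "A", "rate_2010": 12.5, "count_2010": 250,
     "rate_2011": 11.0, "count_2011": 230},
    {"country": "B", "rate_2010": 3.2, "count_2010": 40,
     "rate_2011": 3.5, "count_2011": 45},
]


def wide_to_long(rows, id_col="country"):
    """Turn <measure>_<year> columns into one row per (country, year)."""
    long_rows = {}
    for row in rows:
        for column, value in row.items():
            if column == id_col:
                continue
            # Split e.g. "rate_2010" into the measure "rate" and the year "2010".
            measure, year = column.rsplit("_", 1)
            key = (row[id_col], int(year))
            record = long_rows.setdefault(
                key, {id_col: row[id_col], "year": int(year)}
            )
            record[measure] = value
    # Sort by country, then year, so the output order is stable.
    return [long_rows[key] for key in sorted(long_rows)]


long_rows = wide_to_long(wide_rows)
```

With one row per country and year, a merge by year against other yearly data becomes a simple lookup on the `year` field.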
The data from UNODC will be used to study the relationship between trends in crime and trends in explicit music.
World Happiness:
The first World Happiness Report was released on April 1, 2012 as a foundational text for the UN High Level Meeting: Well-being and Happiness: Defining a New Economic Paradigm, drawing international attention. The report outlined the state of world happiness, causes of happiness and misery, and policy implications highlighted by case studies. In 2013, the second World Happiness Report was issued, and since then has been issued on an annual basis with the exception of 2014. The report primarily uses data from the Gallup World Poll.
The rankings of national happiness are based on a Cantril ladder survey. Nationally representative samples of respondents are asked to think of a ladder, with the best possible life for them being a 10, and the worst possible life being a 0. They are then asked to rate their own current lives on that 0 to 10 scale. The report correlates the results with various life factors.
In the reports, experts in fields including economics, psychology, survey analysis, and national statistics, describe how measurements of well-being can be used effectively to assess the progress of nations, and other topics. Each report is organized by chapters that delve deeper into issues relating to happiness, including mental illness, the objective benefits of happiness, the importance of ethics, policy implications, and links with the Organization for Economic Co-operation and Development’s (OECD) approach to measuring subjective well-being and other international and national efforts.
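Data is collected from people in over 150 countries. Each variable measured reveals a population-weighted average score on a scale running from 0 to 10 that is tracked over time and compared against other countries. These variables currently include GDP per capita, social support, healthy life expectancy, freedom to make life choices, generosity, and perceptions of corruption.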
Each country is also compared against a hypothetical nation called Dystopia. Dystopia represents the lowest national averages for each key variable and is, along with residual error, used as a regression benchmark. The six metrics are used to explain the estimated extent to which each of these factors contributes to increasing life satisfaction when compared to the hypothetical nation of Dystopia, but they themselves do not have an impact on the total score reported for each country.
In the above code, I have changed the variable names in each dataset so they should all (mostly) match with each other. Then, I combined all the datasets into one big “happiness” dataset.
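As a rough illustration of the renaming-and-combining step, here is a minimal Python sketch (not the project's own code). The column spellings, the `RENAME` mapping, and the scores are assumptions chosen for the example.

```python
# Hypothetical per-year tables whose column names differ between years.
yearly_tables = {
    2015: [{"Country": "X", "Happiness Score": 7.5}],
    2019: [{"Country or region": "X", "Score": 7.6}],
}

# Map each variant spelling to one shared name (this mapping is an assumption).
RENAME = {
    "Country": "country",
    "Country or region": "country",
    "Happiness Score": "happiness_score",
    "Score": "happiness_score",
}


def combine(tables):
    """Rename columns to the shared names and stack all years into one list."""
    combined = []
    for year, rows in sorted(tables.items()):
        for row in rows:
            clean = {RENAME.get(name, name): value for name, value in row.items()}
            clean["year"] = year  # keep track of which report the row came from
            combined.append(clean)
    return combined


happiness = combine(yearly_tables)
```

Columns without an entry in the mapping keep their original name, which matches the "(mostly) match" caveat: anything not harmonized stays visible rather than being silently dropped.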
Naturally, there are types of music that go together. For example, it is unlikely that you would listen to a slow, acoustic song in a nightclub. Likewise, it is unlikely that “sad” songs are going to have lots of beat drops and high energy/electric music. All of these relationships can be explored with the Spotify data set and its many features. The pairs plot in Figure 3.1 explores these relationships and correlations.
Figure 3.1: Pairs plot for the variables in the Spotify dataset
By looking at the pairs plot, it is possible to see that acousticness and energy are inversely related, and strongly so (see Figure 3.1, r = -0.967***). There are many other relationships to explore later when building a model. Another promising relationship is between the loudness and popularity parameters: songs that are louder also have a tendency to be more popular.
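For readers unfamiliar with the r value reported in a pairs plot, the Pearson correlation coefficient can be computed as in the Python sketch below. The numbers are made-up yearly averages, not values from the Spotify data, and the function name `pearson_r` is only illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values in which energy rises as acousticness falls.
acousticness = [0.9, 0.7, 0.5, 0.3]
energy = [0.2, 0.4, 0.5, 0.7]

r = pearson_r(acousticness, energy)  # close to -1 for this toy data
```

A value near -1 indicates a strong inverse linear relationship, a value near +1 a strong positive one, and a value near 0 little linear relationship.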
Unfortunately, this process is very computationally taxing to run, so it was necessary to use the “spotify by year” dataset. This takes averages for the years, and does not have explicitness. Therefore, this pairs plot neglects the explicitness variable and each point is a year, not an individual song.
The Shiny app below takes a closer look at each of the relationships in the above pairs plot (Figure 3.1). Each relationship can be plotted as a scatterplot, both with and without a line.
Aside from the various scatterplots, it is interesting to see that a lot of the variables have bimodal distributions. Since the plots come from a long and versatile timeframe (an entire century), this may be an effect of changing taste over time. Changes in music usually reflect society’s expressed mood (both in terms of production and consumption), so this is not a surprise. This is further supported by Figure 1.1, since the valence (or happiness) values fluctuate over time. This is especially clear in the transition from the “roaring 20s” to the 40s and 50s. The detrimental effects of World War II can even be seen in music!
Before we begin combining the Spotify dataset with other datasets, it’s important to look at those datasets individually. For example, how does the World Happiness dataset change over each year?
Figure 3.2: Map of World Happiness from 2015 - 2020
The world map above, in Figure 3.2, is an interactive map that displays the change in world happiness in countries over the years from \(2015\) to \(2020\). (Hover over countries to see more specific information.)
Figure 3.3: Density Ridges Plot of Happiness by Region
The plot above shows a density plot of the happiness scores across different regions of the world. Note that some countries are included twice (ex: in Southeastern Asia AND in South Asia), and that the \(2017\), \(2018\), and \(2019\) data do not split countries into regions, so the happiness score is not plotted for those years. The most noticeable trend is that Western Europe, North America, and Australia and New Zealand tend to have the highest happiness scores.
Figure 3.4: World Happiness from 2015 - 2020
Figure 3.4 shows the mean happiness of the world over time. The range is typically from \(0\) to \(10\), where \(10\) would be the happiest, so it’s important to note that while there is an increase in happiness after \(2017\), the average only goes up by about \(0.125\) points.
Continuing with the exploration, it is time to look at the crime data. The crime data comes from the United Nations Office on Drugs and Crime and depends on reports from each country. This can create discrepancies, because each country has different considerations and guidelines on crime that dictate whether a specific incident enters its records.
By plotting each type of crime over the years and by the region in Figure 3.5 it is possible to see that certain types of crime are more prevalent or less prevalent by region, and there are changes over the years. The changes over the years are present in some regions, but not in others. For example, Europe does not have much fluctuation in crime rates, whereas the Americas show a decline in Assault from 2010-2017. Additionally, Oceania is higher in Assault and Sexual Violence but lower in Robbery.
Figure 3.5: Rates of Crime by Year and Region
Splitting this data by region shows that not every region is the same and that the trends differ. The same variation also exists among the countries of a given region. When viewed as a whole, as in Figure 3.6, the global trends in assault and robbery do not change much, but there is an increasing trend in sexual violence. It may be true that sexual violence is increasing, but it may also be true that it is being reported more. This could stem from changes in the law or from re-classifying some types of assault as sexual violence. Either way, there is an increase in sexual violence crimes according to the UNODC data.
Figure 3.6: Rates of Crime by Type and Year
It is known that reporting changes over time and by country; notwithstanding these limitations, it is still beneficial to look at the data that is available. The Shiny app allows the viewing of country crime rates, filtered by crime type, region, and subregion.
Countries in similar geographic regions can be similar to a great degree. The U.S.A and Canada speak the same language and have much in common. The same can apply for Spanish-speaking countries in South America. However, each country can also be different in many ways, culturally, linguistically, and in terms of crime.
The Shiny app below illustrates this notion, since countries with high rates of violence are grouped with countries that have lower rates (see, for example, Assault, Americas, Northern America in the Shiny app). Bermuda has a much higher crime rate than Canada and the U.S.A., and this can drive up the mean rate of crime for the Americas.
The shiny app makes it possible to see trends in countries of certain regions and how they compare to each other. For example, it is clear that Hungary and Czech Republic have the highest rates of assault in the Eastern European regions. This type of observation can be made with a few other countries as well. Each region has a separate country that stands out.
Also, the lines through the points show that not all countries have data for every year. This can be a limitation; however, the methods used later make use of the general trends and not per-country data. The UNODC does not receive a response from every country in a given year, so it cannot be assumed that no information is missing.
(This is the same EDA graph from earlier, but we limited the years to just \(2015\) to \(2020\) because these are the only years available in our World Happiness dataset.)
In Figure ??, the music and happiness datasets were combined. As a result, each country now has one valence value and one (mean) happiness score. There does appear to be a trend here: as the valence score increases, the mean happiness score also increases. However, this is a slightly inaccurate representation of, at the very least, the happiness score, because each country has different happiness scores.
Figure 4.1: Box plot of valence scores overlaid with points from the dataset
In Figure 4.1, each individual country’s happiness score was plotted against a valence value, with boxplots overlaid on top. The plot is interactive and displays the country name, happiness score, and year of each point. Since the music dataset does not have a country variable, each year/country is still only associated with one valence value. Therefore, valence was treated as a categorical variable in the above plot in order to overlay the boxplots on top of the points, so the distances between the valence values are not proportionate. Though the changes in the plot are very minute, the last four boxplots show that the median happiness score, as well as the 75th percentile, increases as the valence increases.
In Figure 3.5, there was evidence that crime changed over time, even if the change was only slight. The question now is whether this change was paired with changes in explicit music over time. To start the comparison, the percentage of explicit music per year was calculated. This serves as a general indicator of how prevalent explicit music was in a year. Percent explicit music increases continually over the selected time frame of 2010-2017, as indicated by the darkening shade of the colors in Figure 4.3 and by the plot below (Figure 4.2).
Figure 4.2: Percent changes in explicit and non-explicit music over time
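The percent explicit measure can be sketched as a simple per-year share. The Python snippet below is only an illustration under stated assumptions: the track records are made up, and the field names `year` and `explicit` (1 for explicit, 0 otherwise) are placeholders rather than the project's actual column names.

```python
from collections import defaultdict

# Hypothetical track records; `explicit` is 1 for explicit lyrics, else 0.
tracks = [
    {"year": 2010, "explicit": 1},
    {"year": 2010, "explicit": 0},
    {"year": 2010, "explicit": 0},
    {"year": 2010, "explicit": 0},
    {"year": 2011, "explicit": 1},
    {"year": 2011, "explicit": 1},
]


def percent_explicit_by_year(rows):
    """Share of tracks flagged explicit in each year, as a percentage."""
    totals = defaultdict(int)
    explicit = defaultdict(int)
    for row in rows:
        totals[row["year"]] += 1
        explicit[row["year"]] += row["explicit"]
    return {year: 100.0 * explicit[year] / totals[year] for year in sorted(totals)}


percent_explicit = percent_explicit_by_year(tracks)
```

Because the result has one value per year, it summarizes the music data at the same yearly resolution as the crime data.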
The unique spread of the data in the form of horizontal lines is due to the percent explicit scores. Each point on the plot is a country, and each country can have its own crime rate in a given year. However, every country in a given year has the same percent explicit score, since the music data is available as a whole and not on a country-by-country basis. For example, Burundi can have a crime rate different from Sudan in 2010, but they would both have the same percent explicit score.
Figure 4.3: Comparing rate of crime (for 100,000 population) and percent explicit music per year for 2010-2017, displayed by region for each type of crime: assault, robbery, and sexual violence
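The horizontal-line pattern follows directly from how the two datasets are joined. The Python sketch below illustrates the idea with made-up numbers (the rates and percentages are not from the UNODC or Spotify data); the field names are assumptions.

```python
# Made-up yearly percent explicit values (one per year, no country dimension).
percent_explicit = {2010: 25.0, 2011: 30.0}

# Made-up country-year crime rates per 100,000 population.
crime_rows = [
    {"country": "Burundi", "year": 2010, "rate": 4.0},
    {"country": "Sudan", "year": 2010, "rate": 9.0},
    {"country": "Sudan", "year": 2011, "rate": 8.5},
]

# Attach the year-level music value to every country-year crime row.
merged = [
    dict(row, percent_explicit=percent_explicit[row["year"]])
    for row in crime_rows
    if row["year"] in percent_explicit
]

# Both 2010 rows share one percent explicit value even though their rates differ.
same_year_values = {
    row["percent_explicit"] for row in merged if row["year"] == 2010
}
```

Every country observed in the same year lands on the same percent explicit level, which is why the points line up in horizontal bands.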
There were a few outliers, so the log of the crime rate was used instead of the rate on its own. Logistic regression was used since Explicit is a binary variable, but also because we opted to use percent explicit music. The logistic regressions do not show much of a relationship between changes in the rate of crime and explicit music. The strongest relationship is with sexual violence, which makes sense since this type of crime had the greatest increase.
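To show why a log transform helps with outliers, here is a small Python sketch. It covers only the transform, not the regression itself; the rates are made up, and the use of the natural log and the choice to skip zero rates are assumptions, since the text does not specify either.

```python
import math

# Made-up rates per 100,000 population; the last value is an outlier.
rates = [2.0, 5.0, 8.0, 400.0]

# The log is undefined at zero, so non-positive rates are skipped here
# (an assumption; the report does not say how such cases were handled).
log_rates = [math.log(rate) for rate in rates if rate > 0]

# On the raw scale the outlier is 200 times the smallest value ...
raw_ratio = max(rates) / min(rates)

# ... while on the log scale the full spread is only a few units wide.
log_spread = max(log_rates) - min(log_rates)
```

On the log scale, the outlier no longer dominates the axis, so the bulk of the countries remains distinguishable in a plot or a fitted model.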
Starting with the Spotify dataset and its various variables, it was possible to look at how music changed over time. Changes in music and happiness trends can be indicative of how the world is changing. The epochs associated with World War II had low valence (happiness) scores, and other epochs of relative peace and comfort had happier music.
Looking at the Spotify dataset, it was possible to see more than variations across time and to look into how variables such as acousticness and energy were negatively related. Other relationships were positive, such as the one between loudness and popularity.
Looking at the different trends of music, we thought that it may be interesting to compare these trends to happiness and crime. Music has a strong impact on us as people, so it may reflect on trends in happiness or crime.
The happiness dataset came from the World Happiness Report which assigns a score for each country from 0 - 10 and also ranks the countries in order from most happy to least happy. Many factors are used, including, but not limited to, health, government, and major socio-economic conditions. Data from 2015-2020 was used and compared with valence from the Spotify dataset.
Over the different valence scores across the years, there did not seem to be a strong change in the median happiness score overall. There may be more to study here, and it may be better to relate this data by region or some other subset of the global data. The reason is that countries that may have experienced major increases in happiness can be masked by the consistently low-scoring countries in these reports. It is notable that the past three years have seen increases in both valence and happiness scores.
The association between crime and explicit music resulted in a similar message. By exploring the crime data, though, it was possible to get a sense of how the different crimes changed over time. Sexual violence seems to be increasing over the years, while the other types of crime, assault and robbery, are stagnant or slightly decreasing. Of course, this varies by region of the world; on the whole, however, these are the trends.
Explicit music, on the other hand, is definitely on the rise. We were interested in looking into whether this social expression of increased explicitness was connected to any rises in crime. The logistic regression model offered little support for this and did not show a strong relationship. The only positive relationship was with sexual violence. One takeaway is that sexual violence rates are on the rise, as well as explicit music. These two items, even if they are not connected strongly to each other, are both becoming more and more pervasive in our daily lives.
It may seem spurious to correlate music and crime. As always, correlation is not causation, but it is important to understand that music is a form of art, and societal tendencies can be expressed in the art of its time. As artists and people reflect on what is going on around them, they will produce or consume different types of music that resonate with them and with the state of society at the time.
We used these three datasets to explore any trends, comparisons, and visualizations that may have meaning and insight. Lots of our visualizations have insights on how each set changed over time, and how these changes reflect between the datasets.